Design Version Control System

Last Updated: December 19, 2025

Ashish Pratap Singh

hard

In this chapter, we will explore the low-level design of a simplified version control system.

Let's start by clarifying the requirements:

1. Clarifying Requirements

Before starting the design, it's important to ask thoughtful questions to uncover hidden assumptions, clarify ambiguities, and define the system's scope more precisely.

After gathering the details, we can summarize the key system requirements.

1.1 Functional Requirements

  • Support creation and versioning of multiple files within a hierarchical directory structure.
  • Allow users to commit changes across the entire repository (no staging area).
  • Support basic branching operations (create and switch branches).
  • Maintain a commit history for each branch.
  • Allow users to roll back to any previous commit in the history.
  • Store full snapshots of the file system at the time of each commit.
  • Enable viewing the commit history of the repository.

1.2 Non-Functional Requirements

  • Modularity: The system should be designed with clear separation between modules.
  • Maintainability: Code should follow object-oriented principles, be easy to test, and allow for future changes with minimal impact.
  • Reliability: Ensure consistency and correctness of the file system across commits, branches, and rollbacks.
  • Usability: Provide a simple interface to demonstrate core operations such as commit, checkout, branch, and revert.

2. Identifying Core Entities

Core entities are the fundamental building blocks of our system. We identify them by analyzing the functional requirements and highlighting the key nouns and responsibilities that naturally map to object-oriented abstractions such as classes, enums, or interfaces.

Let’s walk through the functional requirements and extract the relevant entities:

1. We need to model files and folders in a directory tree.

A version control system manages codebases, which are nothing more than directories containing files and subdirectories. To represent this structure:

  • We introduce a common abstraction FileSystemNode, which acts as the base class for anything that can exist in the repository, either a File or a Directory.
  • The File class represents a single file with its name and contents.
  • The Directory class can contain multiple FileSystemNode children, enabling a tree-like structure to model folders and subfolders.

2. We need to take a snapshot of the file system every time the user commits.

Each time a user commits, the system needs to save the entire state of the file system. This is where the Commit class comes in.

  • A Commit object stores a full snapshot of the repository at a specific point in time.
  • It contains metadata like a unique ID, timestamp, commit message, and a reference to the root Directory of that snapshot.

To manage and retrieve these commits efficiently:

  • The CommitManager acts as a registry for all commits. It handles commit creation and lookup operations.

3. We need to support branches and maintain separate commit histories.

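Branches allow developers to work in isolated timelines. Each branch has its own head and its own history, although branches created from the same commit share every commit up to the point where they diverged.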

  • The Branch class represents a line of development. It keeps track of its name and points to the latest commit (HEAD).
  • The BranchManager is responsible for creating new branches, switching between them, and maintaining all the branches in the system.

This setup ensures that multiple versions of the project can be developed independently.

4. We need a central engine to coordinate everything: commits, branches, and file states.

To tie everything together, we need a top-level controller that understands the current state and user operations:

  • The VersionControlSystem class plays this role. It manages the active branch, handles operations like commit, checkout, and revert, and interfaces with both the CommitManager and BranchManager.

This class is the main entry point for any core version control operation.

5. We need a simple way to demonstrate how the system works.

For testing and demonstration purposes:

  • The VersionControlSystemDemo class runs a predefined sequence of operations (like creating files, committing, switching branches) to showcase how the system behaves.

This helps in validating the logic without building a full-fledged CLI.

These core entities define the key abstractions of a version control system and will guide the structure of your low-level design and class diagrams.

3. Designing Classes and Relationships

In this section, we outline the core classes involved in the design of a lightweight, in-memory version control system.

3.1 Class Definitions

Commit

Represents a snapshot of the entire file system at a given point in time.

CommitManager

Responsible for creating and storing Commit objects.

Branch

Represents a named pointer to a chain of commits.

BranchManager

Manages multiple branches using a Map<branchName, Branch>.

VersionControlSystem

The central controller class that exposes the public API of the version control system.

3.2 Key Design Patterns

Composite Pattern

Problem it Solves: The system needs to manage a file system, which is a tree-like structure containing both individual files (leaves) and directories (containers/branches of the tree). We need a way to treat both individual objects and compositions of objects uniformly.

How it's Applied:

  • We define a common abstract class or interface, FileSystemNode.
  • File is a "leaf" class that implements FileSystemNode.
  • Directory is a "composite" class that implements FileSystemNode and holds a collection of other FileSystemNode instances (which can be either File or other Directory objects).

Benefits:

  • Simplicity: Client code (like the commit logic) doesn't need to differentiate between files and directories. It can simply call a method like root.clone() and the entire tree is cloned recursively, regardless of its structure.
  • Extensibility: Adding new types of nodes (e.g., SymbolicLink) is easy; you just create a new class that extends FileSystemNode.

Where used: Directory and File both inherit from the common FileSystemNode base, and a Directory holds children of type FileSystemNode.

Why: Lets you treat individual files and folders uniformly and walk the tree recursively.
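
To make the uniform-treatment point concrete, here is a minimal sketch. The count_files() method is a hypothetical addition used only for illustration and is not part of the design in this chapter. It shows how one recursive call works on a file or a directory without any type checks in the caller.

```python
from abc import ABC, abstractmethod
from typing import Dict


class FileSystemNode(ABC):
    def __init__(self, name: str):
        self.name = name

    @abstractmethod
    def count_files(self) -> int:
        """Hypothetical operation: number of files at or below this node."""


class File(FileSystemNode):
    def __init__(self, name: str, content: str):
        super().__init__(name)
        self.content = content

    def count_files(self) -> int:
        return 1  # a leaf counts itself


class Directory(FileSystemNode):
    def __init__(self, name: str):
        super().__init__(name)
        self.children: Dict[str, FileSystemNode] = {}

    def add_child(self, node: FileSystemNode) -> None:
        self.children[node.name] = node

    def count_files(self) -> int:
        # Delegate to the children without checking their concrete type.
        return sum(child.count_files() for child in self.children.values())


root = Directory("root")
src = Directory("src")
root.add_child(File("README.md", "docs"))
root.add_child(src)
src.add_child(File("main.py", "print('hi')"))
src.add_child(Directory("empty"))  # an empty directory contributes nothing

total = root.count_files()
print(total)
```

The caller only ever sees FileSystemNode, so a new node type would need its own count_files() and nothing else would change.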

Prototype Pattern

Problem it Solves: The requirement is to store "full snapshots" of the file system for each commit. This means we need to create an independent, deep copy of the entire workingDirectory tree every time a commit is made. Manually iterating and creating new objects would be complex and error-prone.

  • Where used: clone() methods on File and Directory to create deep copies of the filesystem snapshot.
  • Why: Decouples snapshot creation from concrete classes and makes “make a copy” a first-class operation.

How it's Applied:

  • The FileSystemNode abstract class defines a clone() method.
  • Both File and Directory provide concrete implementations of clone(). The Directory's clone() method recursively calls clone() on all its children, ensuring a deep copy is made.

Benefits:

  • Encapsulates Complexity: The logic for creating a complete copy is contained within the objects themselves, not in a separate manager class.
  • Performance: Our implementation is simple, but when building an object from scratch is expensive (for example, it requires parsing or I/O), copying an existing instance can be cheaper.
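
The property that matters for commits is that a clone is independent of the tree it was copied from. The sketch below is a trimmed version of the classes in this chapter (the print method is left out) and checks that edits made after cloning do not leak into the snapshot.

```python
from abc import ABC, abstractmethod
from typing import Dict


class FileSystemNode(ABC):
    def __init__(self, name: str):
        self.name = name

    @abstractmethod
    def clone(self) -> "FileSystemNode":
        """Return a deep copy of this node."""


class File(FileSystemNode):
    def __init__(self, name: str, content: str):
        super().__init__(name)
        self.content = content

    def clone(self) -> "FileSystemNode":
        return File(self.name, self.content)


class Directory(FileSystemNode):
    def __init__(self, name: str):
        super().__init__(name)
        self.children: Dict[str, FileSystemNode] = {}

    def add_child(self, node: FileSystemNode) -> None:
        self.children[node.name] = node

    def clone(self) -> "FileSystemNode":
        copy = Directory(self.name)
        for child in self.children.values():
            copy.add_child(child.clone())  # recurse so nothing is shared
        return copy


working = Directory("root")
working.add_child(File("README.md", "v1"))

snapshot = working.clone()                     # what a commit would store
working.children["README.md"].content = "v2"   # later edit to the working tree
working.add_child(File("new.txt", ""))

snapshot_readme = snapshot.children["README.md"].content
snapshot_names = sorted(snapshot.children)
print(snapshot_readme, snapshot_names)
```

If Directory.clone() copied only the children dictionary instead of cloning each child, the edit to README.md would show up in the snapshot as well.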

Facade Pattern

Problem it Solves: The internal workings of the VCS are complex. There are Commit objects, Branch pointers, a map of all historical commits, and the workingDirectory. A user shouldn't have to interact with all these components directly. They need a simple, high-level API.

How it's Applied:

  • The VersionControlSystem class acts as the facade. It provides a clean and simple interface with methods like commit(), checkoutBranch(), log(), and revert().
  • It hides the complexity of creating Commit objects, managing the branches map, deep-cloning the workingDirectory, and traversing the commit history.

Benefits:

  • Decoupling: The client code (the VersionControlSystemDemo driver class) is completely decoupled from the system's internal implementation.
  • Usability: It makes the system much easier to use and understand from an external perspective.

Memento Pattern

  • Where used: Commit holds a snapshot (the memento) of the entire Directory tree.
  • Why: Encapsulates the internal state of the file tree so you can restore it later (e.g. on revert or checkout).
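
As a rough illustration of the save-and-restore cycle, the sketch below uses a nested dict as a stand-in for the Directory tree and the standard library's copy.deepcopy in place of a hand-written clone(). The names Snapshot and Workspace are hypothetical and exist only for this example.

```python
import copy
from dataclasses import dataclass
from typing import Dict, Union

# File name -> content string, directory name -> nested tree.
Tree = Dict[str, Union[str, "Tree"]]


@dataclass(frozen=True)
class Snapshot:
    """Memento: a saved copy of the workspace state plus a label."""
    message: str
    tree: Tree


class Workspace:
    """Originator: owns the live state and knows how to save and restore it."""

    def __init__(self) -> None:
        self.tree: Tree = {}

    def save(self, message: str) -> Snapshot:
        return Snapshot(message, copy.deepcopy(self.tree))

    def restore(self, snapshot: Snapshot) -> None:
        # Copy again so later edits never touch the stored snapshot.
        self.tree = copy.deepcopy(snapshot.tree)


ws = Workspace()
ws.tree["README.md"] = "v1"
first = ws.save("first")

ws.tree["README.md"] = "v2"
ws.tree["src"] = {"main.py": "print('hi')"}

ws.restore(first)
print(ws.tree)
```

The second deepcopy in restore() plays the same role as the clone() call in checkout and revert: the stored snapshot stays untouched no matter what happens to the working state afterwards.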

3.3 Full Class Diagram

Version Control System Class Diagram

4. Implementation

4.1 FileSystemNode

An abstract base class representing a node in the file system.

from abc import ABC, abstractmethod


class FileSystemNode(ABC):
    def __init__(self, name: str):
        self.name = name

    def get_name(self) -> str:
        return self.name

    @abstractmethod
    def clone(self) -> 'FileSystemNode':
        """Return a deep copy of this node (Prototype Pattern)."""
        pass

    @abstractmethod
    def print(self, indent: str):
        """Print this node, prefixed by the given indentation."""
        pass

It defines a common interface for files and directories, including methods for cloning (Prototype Pattern) and printing the structure.

4.2 File

A concrete subclass of FileSystemNode representing a file.

class File(FileSystemNode):
    def __init__(self, name: str, content: str):
        super().__init__(name)
        self.content = content

    def get_content(self) -> str:
        return self.content

    def set_content(self, content: str):
        self.content = content

    def clone(self) -> 'FileSystemNode':
        # Strings are immutable, so sharing the content reference is safe.
        return File(self.name, self.content)

    def print(self, indent: str):
        # Inside the method body, print() still resolves to the built-in.
        print(f"{indent}- {self.name} (File)")

It stores file content and supports deep cloning of file state for snapshotting during commits.

4.3 Directory

A concrete subclass of FileSystemNode representing a directory.

from typing import Dict, Optional


class Directory(FileSystemNode):
    def __init__(self, name: str):
        super().__init__(name)
        # Children are keyed by name, so adding a node with an
        # existing name replaces the previous one.
        self.children: Dict[str, FileSystemNode] = {}

    def add_child(self, node: FileSystemNode):
        self.children[node.get_name()] = node

    def get_child(self, name: str) -> Optional[FileSystemNode]:
        return self.children.get(name)

    def get_children(self) -> Dict[str, FileSystemNode]:
        return self.children

    def clone(self) -> 'FileSystemNode':
        # Deep copy: every child is cloned recursively.
        new_dir = Directory(self.name)
        for child in self.children.values():
            new_dir.add_child(child.clone())
        return new_dir

    def print(self, indent: str):
        print(f"{indent}+ {self.name} (Directory)")
        for child in self.children.values():
            child.print(indent + "  ")

It can contain other FileSystemNode instances (files or subdirectories), supports recursive cloning, and enables hierarchical structure.
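
Because children are keyed by name, looking up a nested node is a matter of walking the path one segment at a time. The resolve() function below is a hypothetical helper, not part of the chapter's design, and the classes are trimmed to what the lookup needs.

```python
from typing import Dict, Optional


class FileSystemNode:
    def __init__(self, name: str):
        self.name = name


class File(FileSystemNode):
    def __init__(self, name: str, content: str):
        super().__init__(name)
        self.content = content


class Directory(FileSystemNode):
    def __init__(self, name: str):
        super().__init__(name)
        self.children: Dict[str, FileSystemNode] = {}

    def add_child(self, node: FileSystemNode) -> None:
        self.children[node.name] = node

    def get_child(self, name: str) -> Optional[FileSystemNode]:
        return self.children.get(name)


def resolve(root: Directory, path: str) -> Optional[FileSystemNode]:
    """Hypothetical helper: follow a '/'-separated path starting at root."""
    node: Optional[FileSystemNode] = root
    for part in filter(None, path.split("/")):  # skip empty segments
        if not isinstance(node, Directory):
            return None  # tried to descend into a file or a missing node
        node = node.get_child(part)
    return node


root = Directory("root")
src = Directory("src")
root.add_child(File("README.md", "This is a simple VCS."))
root.add_child(src)
src.add_child(File("Main.java", "public class Main {}"))

found = resolve(root, "src/Main.java")
print(found.content if isinstance(found, File) else None)
```

A missing segment, or a path that tries to go through a file, yields None instead of raising.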

4.4 Commit

Represents a single commit in the version control system.

import uuid
from datetime import datetime
from typing import Optional


class Commit:
    def __init__(self, author: str, message: str, parent: Optional['Commit'], root_snapshot: Directory):
        # Short random ID: the first 8 characters of a UUID4.
        self.id = str(uuid.uuid4())[:8]
        self.author = author
        self.message = message
        self.parent = parent  # None for the very first commit
        self.root_snapshot = root_snapshot
        self.timestamp = datetime.now()

    def get_id(self) -> str:
        return self.id

    def get_message(self) -> str:
        return self.message

    def get_author(self) -> str:
        return self.author

    def get_timestamp(self) -> datetime:
        return self.timestamp

    def get_parent(self) -> Optional['Commit']:
        return self.parent

    def get_root_snapshot(self) -> Directory:
        return self.root_snapshot

It captures a snapshot of the file system (Directory), the commit metadata (author, message, timestamp), and a reference to its parent commit, forming a chain of history.

4.5 CommitManager

Handles the creation and retrieval of commits.

from typing import Dict, Optional


class CommitManager:
    def __init__(self):
        # Registry of every commit ever created, keyed by commit ID.
        self.commits: Dict[str, Commit] = {}

    def create_commit(self, author: str, message: str, parent: Optional[Commit], root_snapshot: Directory) -> Commit:
        new_commit = Commit(author, message, parent, root_snapshot)
        self.commits[new_commit.get_id()] = new_commit
        return new_commit

    def get_commit(self, commit_id: str) -> Optional[Commit]:
        return self.commits.get(commit_id)

    def print_history(self, head_commit: Optional[Commit]):
        if head_commit is None:
            print("No commits in history.")
            return

        # Walk the parent links from newest to oldest.
        current = head_commit
        while current is not None:
            print(f"Commit: {current.get_id()}")
            print(f"Author: {current.get_author()}")
            print(f"Date: {current.get_timestamp()}")
            print(f"Message: {current.get_message()}")
            print("--------------------")
            current = current.get_parent()

Maintains a map of all commit IDs and provides functionality to print the commit history starting from a specific commit.
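
Since print_history writes straight to standard output, it is awkward to assert on in a unit test. One possible variation, sketched here with a hypothetical collect_history() helper and a Commit trimmed to the fields the walk needs, returns the IDs as a list instead.

```python
from typing import List, Optional


class Commit:
    """Trimmed stand-in: explicit IDs keep the example deterministic."""

    def __init__(self, commit_id: str, message: str, parent: Optional["Commit"]):
        self.id = commit_id
        self.message = message
        self.parent = parent


def collect_history(head: Optional[Commit]) -> List[str]:
    """Hypothetical helper: commit IDs from head back to the first commit."""
    ids: List[str] = []
    current = head
    while current is not None:
        ids.append(current.id)
        current = current.parent
    return ids


c1 = Commit("c1", "Initial commit", None)
c2 = Commit("c2", "Add README", c1)
c3 = Commit("c3", "Update README", c2)

history = collect_history(c3)
print(history)
```

The traversal is the same parent-pointer walk that print_history performs; only the output channel differs.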

4.6 Branch

Represents a branch in the version control system.

class Branch:
    def __init__(self, name: str, head: Commit):
        self.name = name
        self.head = head  # latest commit on this branch

    def get_name(self) -> str:
        return self.name

    def get_head(self) -> Commit:
        return self.head

    def set_head(self, head: Commit):
        self.head = head

Each branch has a name and a reference to its latest commit (the head).

4.7 BranchManager

Manages all branches in the system.

from typing import Dict


class BranchManager:
    def __init__(self, initial_commit: Commit):
        self.branches: Dict[str, Branch] = {}
        # Every repository starts with a 'main' branch at the initial commit.
        main_branch = Branch("main", initial_commit)
        self.branches["main"] = main_branch
        self.current_branch = main_branch

    def create_branch(self, name: str, head: Commit):
        if name in self.branches:
            print(f"Error: Branch '{name}' already exists.")
            return
        new_branch = Branch(name, head)
        self.branches[name] = new_branch
        print(f"Created branch '{name}'.")

    def switch_branch(self, name: str) -> bool:
        if name not in self.branches:
            print(f"Error: Branch '{name}' not found.")
            return False
        self.current_branch = self.branches[name]
        print(f"Switched to branch '{name}'.")
        return True

    def update_head(self, new_head: Commit):
        self.current_branch.set_head(new_head)

    def get_current_branch(self) -> Branch:
        return self.current_branch

Supports creating new branches, switching between them, and updating the head commit of the current branch.
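
A new branch starts at the same commit as the branch it was created from, and the two only diverge once each receives its own commits. The sketch below shows this with trimmed Commit and Branch classes; the explicit commit IDs and the ancestors() helper are assumptions made for the example.

```python
from typing import List, Optional


class Commit:
    def __init__(self, commit_id: str, parent: Optional["Commit"]):
        self.id = commit_id
        self.parent = parent


class Branch:
    def __init__(self, name: str, head: Commit):
        self.name = name
        self.head = head


def ancestors(commit: Optional[Commit]) -> List[str]:
    """Hypothetical helper: IDs reachable from a commit, newest first."""
    ids: List[str] = []
    while commit is not None:
        ids.append(commit.id)
        commit = commit.parent
    return ids


c1 = Commit("c1", None)
main = Branch("main", c1)
feature = Branch("feature", main.head)   # created from main's current head
same_start = feature.head is main.head   # both point at c1 for now

feature.head = Commit("c2", feature.head)  # a commit on feature
main.head = Commit("c3", main.head)        # a commit on main

print(ancestors(main.head), ancestors(feature.head))
```

Creating a branch copies nothing: it only adds another pointer to an existing commit, which is why branch creation is cheap even though each commit stores a full snapshot.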

4.8 VersionControlSystem

The central singleton class that coordinates all components.

class VersionControlSystem:
    _instance = None

    def __new__(cls):
        # Singleton: always hand back the same instance.
        if cls._instance is None:
            cls._instance = super().__new__(cls)
        return cls._instance

    def __init__(self):
        # __init__ runs on every VersionControlSystem() call, so guard
        # against re-initializing the shared instance.
        if hasattr(self, '_initialized'):
            return
        self._initialized = True
        self.commit_manager = CommitManager()
        self.working_directory = Directory("root")
        initial_commit = self.commit_manager.create_commit("system", "Initial commit", None, self.working_directory.clone())
        self.branch_manager = BranchManager(initial_commit)

    @classmethod
    def get_instance(cls):
        if cls._instance is None:
            cls._instance = cls()
        return cls._instance

    def get_working_directory(self) -> Directory:
        return self.working_directory

    def commit(self, author: str, message: str) -> str:
        parent_commit = self.branch_manager.get_current_branch().get_head()
        # Store a deep copy so later edits cannot alter the commit.
        snapshot = self.working_directory.clone()

        new_commit = self.commit_manager.create_commit(author, message, parent_commit, snapshot)
        self.branch_manager.update_head(new_commit)

        print(f"Committed {new_commit.get_id()} to branch {self.branch_manager.get_current_branch().get_name()}")
        return new_commit.get_id()

    def create_branch(self, name: str):
        # The new branch starts at the current branch's head.
        head = self.branch_manager.get_current_branch().get_head()
        self.branch_manager.create_branch(name, head)

    def checkout_branch(self, name: str):
        success = self.branch_manager.switch_branch(name)
        if success:
            new_head = self.branch_manager.get_current_branch().get_head()
            # Replaces the working directory object with a fresh clone.
            self.working_directory = new_head.get_root_snapshot().clone()

    def revert(self, commit_id: str):
        target_commit = self.commit_manager.get_commit(commit_id)
        if target_commit is None:
            print(f"Error: Commit '{commit_id}' not found.")
            return
        self.working_directory = target_commit.get_root_snapshot().clone()
        # Moves the branch head back; later commits stay in the
        # CommitManager but are no longer reachable from this branch.
        self.branch_manager.update_head(target_commit)

        print(f"Repository state reverted to commit {commit_id}")

    def log(self):
        print(f"\n--- Commit History for branch '{self.branch_manager.get_current_branch().get_name()}' ---")
        head_commit = self.branch_manager.get_current_branch().get_head()
        self.commit_manager.print_history(head_commit)

    def print_current_state(self):
        print("\n--- Current Working Directory State ---")
        self.working_directory.print("")

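It manages the working directory, delegates commit and branch operations, and supports key VCS operations like commit, revert, branch creation, checkout, and logging. Note that revert moves the current branch's head back to the target commit instead of adding a new commit on top, and that checkout and revert both replace the working directory with a fresh clone of the stored snapshot.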
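
One consequence of swapping in a fresh clone is that any reference a caller obtained from get_working_directory() before a checkout or revert no longer points at the live tree. The sketch below reduces the class to the relevant lines; MiniVcs, its flat name-to-content Directory, and the explicit commit IDs are hypothetical simplifications, not the chapter's classes.

```python
from typing import Dict


class Directory:
    """Trimmed stand-in: maps file names directly to content strings."""

    def __init__(self) -> None:
        self.files: Dict[str, str] = {}

    def clone(self) -> "Directory":
        copy = Directory()
        copy.files = dict(self.files)
        return copy


class MiniVcs:
    """Hypothetical reduction of VersionControlSystem to the relevant lines."""

    def __init__(self) -> None:
        self.working_directory = Directory()
        self.snapshots: Dict[str, Directory] = {}

    def get_working_directory(self) -> Directory:
        return self.working_directory

    def commit(self, commit_id: str) -> None:
        self.snapshots[commit_id] = self.working_directory.clone()

    def revert(self, commit_id: str) -> None:
        # Same move as the full class: swap in a fresh clone.
        self.working_directory = self.snapshots[commit_id].clone()


vcs = MiniVcs()
root = vcs.get_working_directory()
root.files["README.md"] = "v1"
vcs.commit("c1")

vcs.revert("c1")                  # working_directory is now a new object
root.files["lost.txt"] = "oops"   # edits the old, detached tree

stale = root is not vcs.get_working_directory()
vcs.commit("c2")
lost_in_c2 = "lost.txt" in vcs.snapshots["c2"].files

root = vcs.get_working_directory()  # re-fetch before editing
root.files["kept.txt"] = "ok"
vcs.commit("c3")
print(stale, lost_in_c2, sorted(vcs.snapshots["c3"].files))
```

Callers should therefore fetch the working directory again after every checkout or revert. An alternative design would mutate the existing root in place, at the cost of more bookkeeping.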

4.9 VersionControlSystemDemo

A demonstration class that simulates interactions with the version control system.

class VersionControlSystemDemo:
    @staticmethod
    def main():
        print("Initializing Version Control System...")
        vcs = VersionControlSystem.get_instance()

        # --- Initial State on 'main' branch ---
        vcs.print_current_state()

        # --- First Commit ---
        print("\n1. Making initial changes and committing...")
        root = vcs.get_working_directory()
        root.add_child(File("README.md", "This is a simple VCS."))
        src_dir = Directory("src")
        root.add_child(src_dir)
        src_dir.add_child(File("Main.java", "public class Main {}"))
        first_commit_id = vcs.commit("Alice", "Add README and initial source structure")
        vcs.print_current_state()

        # --- Second Commit ---
        print("\n2. Modifying a file and committing again...")
        readme = root.get_child("README.md")
        readme.set_content("This is an in-memory version control system.")
        second_commit_id = vcs.commit("Alice", "Update README documentation")
        vcs.print_current_state()

        # --- View History ---
        vcs.log()

        # --- Branching ---
        print("\n3. Creating a new branch 'feature/add-tests'...")
        vcs.create_branch("feature/add-tests")
        vcs.checkout_branch("feature/add-tests")
        # Checkout replaces the working directory with a fresh clone,
        # so the old 'root' reference is stale and must be re-fetched.
        root = vcs.get_working_directory()

        print("\n4. Working on the new branch...")
        test_dir = Directory("tests")
        root.add_child(test_dir)
        test_dir.add_child(File("VCS_Test.java", "import org.junit.Test;"))
        feature_commit_id = vcs.commit("Bob", "Add test directory and initial test file")
        vcs.print_current_state()

        # --- View history on feature branch ---
        vcs.log()

        # --- Switch back to main ---
        print("\n5. Switching back to 'main' branch...")
        vcs.checkout_branch("main")
        # Notice the 'tests' directory is gone, as it only exists on the feature branch.
        vcs.print_current_state()
        vcs.log()  # Log shows only main branch history

        # --- Reverting ---
        print("\n6. Reverting 'main' branch to the first commit...")
        vcs.revert(first_commit_id)
        vcs.print_current_state()  # The working directory matches the first commit again.

        # --- View history after revert ---
        print("\nHistory of 'main' after reverting:")
        vcs.log()  # The head is now the first commit


if __name__ == "__main__":
    VersionControlSystemDemo.main()

It walks through scenarios such as making commits, branching, switching branches, and reverting to previous commits.
